Coding

Part:BBa_K4247014:Design

Designed by: Matteo Soana   Group: iGEM22_UCopenhagen   (2022-09-26)


Aneroin_NT


Assembly Compatibility:
  • 10
    COMPATIBLE WITH RFC[10]
  • 12
    COMPATIBLE WITH RFC[12]
  • 21
    INCOMPATIBLE WITH RFC[21]
    Illegal BamHI site found at 189
  • 23
    COMPATIBLE WITH RFC[23]
  • 25
    INCOMPATIBLE WITH RFC[25]
    Illegal AgeI site found at 232
    Illegal AgeI site found at 292
  • 1000
    COMPATIBLE WITH RFC[1000]
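
The illegal-site entries above can be approximated with a plain substring scan. The following is a minimal sketch in Python, assuming the standard recognition sequences of BamHI (GGATCC) and AgeI (ACCGGT). The function name `find_sites`, the toy sequence and the 1-based position convention are illustrative assumptions; this is not the Registry's own checker and the toy sequence is not the part sequence.

```python
# Recognition sequences of the two enzymes flagged in the compatibility report.
# Both sites are palindromic, so scanning one strand is sufficient.
FLAGGED_SITES = {
    "BamHI": "GGATCC",  # reported as illegal for RFC[21]
    "AgeI": "ACCGGT",   # reported as illegal for RFC[25]
}


def find_sites(sequence, site):
    """Return the 1-based start positions of every (possibly overlapping) match."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = sequence.find(site, start + 1)
    return positions


# Hypothetical toy sequence, used only to exercise the scan.
toy_sequence = "AAGGATCCTTACCGGTAA"

for enzyme, site in FLAGGED_SITES.items():
    for position in find_sites(toy_sequence, site):
        print(f"Illegal {enzyme} site found at {position}")
```

Running the same scan on the actual part sequence should locate the sites listed above, although the reported positions may differ if the Registry counts positions differently.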


Design Notes

Aneroin is a composite part. The sequence made available by Yang et al. (2013) was codon-optimized according to E. coli codon bias. The repetitive nature of the protein (motif GPGNTGYPGQ) made careful part design essential, and aneroin was therefore split into 3 basic parts that could be synthesised:

Aneroin_Nterm: contains a TrpL region and 10 repeats of the aneroin motif (as per the original sequence, although two of these repeats are only part of the repetitive motif). The TrpL region is a regulatory region whose transcript can form alternative secondary structures, enabling control of termination/attenuation at the transcriptional level. Protein sequence: GPGNTGYPGQ - GPGNTGHPGQ - GPGNTGYPGQ - GNTGYPGQDP - GNTGYPGQDP - GNTGYPGQGP - GNTGCPGQGP - GNTGCPGQGP - GNTGYPGQGP - GNTG

Aneroin_middle: the middle part has eleven motif repeats similar to the first, but with slightly more variation. It also contains three partial motifs. Protein sequence: YPGQ - GPSNTGYPWQ - GPGNT - GPGNTGYPGQ - GPGNTGHPGQ - GPGNTGYPGQ - GNTGYPGQ - DPGNTGCPGQ - GPGNTGCPGQ - GSGNTGCPGQ - GSGNTGCPGQ - GPGQGPGNTG - YPG

Aneroin_Cterm: the C-terminal region of the aneroin protein. It contains ten repeats of the motif and two partial repeats. We decided to place a 6-His tag at this (C-terminal) end to facilitate purification, as in the original paper. Our purification tag was, however, part of the pET24(+) T7 expression plasmid. As a result, a short sequence (AGACTCGAG) lying between the end of the coding sequence and the beginning of the His-tag, and coding for three amino acids (RLE), was also expressed as part of the final protein. Protein sequence: Q - GPGNTGHPGQ - GPGNTGYPGQ - DPGNTGYPGQ - DPGNTGCPGQ - GPGNTGCPGQ - GSGNTGCPGQ - GSGNTGCPGQ - GPGQ - GPGNTGYPGQ - GPSNTGYPGQ - GPGNTGYPGQ - GPGNTG
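
As a consistency check, the three protein sequences listed above can be joined and the linker translated. The sketch below, in Python, uses the sequences exactly as written in these notes (so without the TrpL region), translates AGACTCGAG with the standard genetic code, and appends a His-tag. The helper names and the assumption that the tag is exactly six histidines are illustrative and not taken from the plasmid map.

```python
# Protein sequences as listed in the design notes; spaces stand in for the " - " separators.
NTERM = ("GPGNTGYPGQ GPGNTGHPGQ GPGNTGYPGQ GNTGYPGQDP GNTGYPGQDP "
         "GNTGYPGQGP GNTGCPGQGP GNTGCPGQGP GNTGYPGQGP GNTG")
MIDDLE = ("YPGQ GPSNTGYPWQ GPGNT GPGNTGYPGQ GPGNTGHPGQ GPGNTGYPGQ GNTGYPGQ "
          "DPGNTGCPGQ GPGNTGCPGQ GSGNTGCPGQ GSGNTGCPGQ GPGQGPGNTG YPG")
CTERM = ("Q GPGNTGHPGQ GPGNTGYPGQ DPGNTGYPGQ DPGNTGCPGQ GPGNTGCPGQ "
         "GSGNTGCPGQ GSGNTGCPGQ GPGQ GPGNTGYPGQ GPSNTGYPGQ GPGNTGYPGQ GPGNTG")

# Standard genetic code in the usual TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon (length must be a multiple of 3)."""
    if len(dna) % 3 != 0:
        raise ValueError("DNA length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def join_parts(*parts):
    """Concatenate part sequences after removing the separators."""
    return "".join(part.replace(" ", "") for part in parts)


LINKER_DNA = "AGACTCGAG"   # sequence between the coding sequence and the His-tag
HIS_TAG = "HHHHHH"         # assumption: six histidines

nterm, middle, cterm = (join_parts(p) for p in (NTERM, MIDDLE, CTERM))
aneroin = join_parts(NTERM, MIDDLE, CTERM)
expressed = aneroin + translate(LINKER_DNA) + HIS_TAG

# Residues spanning the two part junctions, to see how the split falls inside a motif.
junction_1 = aneroin[len(nterm) - 4:len(nterm) + 4]
junction_2 = aneroin[len(nterm) + len(middle) - 3:len(nterm) + len(middle) + 1]

print("Linker translation:", translate(LINKER_DNA))
print("Nterm/middle junction:", junction_1)
print("middle/Cterm junction:", junction_2)
print("C-terminal end of expressed protein:", expressed[-15:])
```

The junction strings show that both splits fall inside a motif, which is why the middle and C-terminal parts begin with a partial repeat.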

The parts were assembled using Golden Gate cloning, by adding BsaI recognition sites to this sequence so that it would reconstitute the original sequence with the other two basic parts.
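
The idea behind the BsaI flanks can be sketched in a few lines of Python. BsaI recognises GGTCTC and cuts outside its recognition site, leaving a 4-nt overhang, so adjacent fragments that share a 4-nt overlap ligate seamlessly. The toy sequence, cut points, spacer bases and function names below are hypothetical and chosen only for illustration; they are not the overhangs or sequences used for this part.

```python
BSAI_SITE = "GGTCTC"     # BsaI recognition site (top strand)
BSAI_SITE_RC = "GAGACC"  # reverse complement, used on the 3' flank


def split_with_overhangs(sequence, cut_points):
    """Split a sequence into fragments in which neighbours share a 4-nt overhang.

    Each cut point is the index where a shared 4-nt overhang starts.
    """
    bounds = [0] + list(cut_points) + [len(sequence) - 4]
    return [sequence[bounds[i]:bounds[i + 1] + 4] for i in range(len(bounds) - 1)]


def add_bsai_flanks(fragment):
    """Add BsaI sites with a 1-nt spacer so that cutting exposes the fragment ends."""
    return BSAI_SITE + "A" + fragment + "T" + BSAI_SITE_RC


def digest(flanked):
    """Simulate BsaI digestion: drop the outer recognition sites and spacers.

    Simplification: internal BsaI sites are ignored here, although a real
    digest would cut them too.
    """
    start = flanked.index(BSAI_SITE) + len(BSAI_SITE) + 1
    end = flanked.rindex(BSAI_SITE_RC) - 1
    return flanked[start:end]


def ligate(fragments):
    """Join fragments in order, requiring matching 4-nt overhangs."""
    product = fragments[0]
    for fragment in fragments[1:]:
        if product[-4:] != fragment[:4]:
            raise ValueError("overhangs do not match")
        product += fragment[4:]
    return product


# Hypothetical toy sequence and cut points (not the real part).
toy = "ATGGGTCCGGGTAACACCGGCTATCCGGGTCAGGGCCCGGGTAACACC"
fragments = split_with_overhangs(toy, cut_points=[15, 30])
flanked = [add_bsai_flanks(f) for f in fragments]
reassembled = ligate([digest(f) for f in flanked])
print(reassembled == toy)
```

In a real design the overhangs would also need to be unique and non-palindromic, and the outermost overhangs would have to match the destination vector; the sketch does not check these conditions.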

In the original paper, aneroin itself was used as a repeat unit, so that the final constructs would code for much larger proteins and thus allow production of stronger fibres.



Source

The reference sequence is NCBI XP_001621085.1 while the annotated reference genome on NCBI is GCF_932526225.1. The information was obtained through Yang, Y., Choi, Y., Jung, D., Park, B., Hwang, W., Kim, H. and Cha, H., 2013. Production of a novel silk-like protein from sea anemone and fabrication of wet-spun and electrospun marine-derived silk fibers. NPG Asia Materials, 5(6), pp.e50-e50.


References

Yang, Y., Choi, Y., Jung, D., Park, B., Hwang, W., Kim, H. and Cha, H., 2013. Production of a novel silk-like protein from sea anemone and fabrication of wet-spun and electrospun marine-derived silk fibers. NPG Asia Materials, 5(6), pp.e50-e50.